Learning 2-Opt Heuristics for Routing Problems via Deep Reinforcement Learning
Abstract
Recent works using deep learning to solve routing problems such as the traveling salesman problem (TSP) have focused on learning construction heuristics. Such approaches find good quality solutions but require additional procedures such as beam search and sampling to improve solutions and achieve state-of-the-art performance. However, few studies have focused on improvement heuristics, where a given solution is improved until reaching a near-optimal one. In this work, we propose to learn a local search heuristic based on 2-opt operators via deep reinforcement learning. We propose a policy gradient algorithm to learn a stochastic policy that selects 2-opt operations given a current solution. Moreover, we introduce a policy neural network that leverages a pointing attention mechanism, which can be easily extended to more general k-opt moves. Our results show that the learned policies can improve even over random initial solutions and approach near-optimal solutions faster than previous state-of-the-art deep learning methods for the TSP. We also adapt the proposed method to two extensions of the TSP: the multiple TSP and the Vehicle Routing Problem, achieving results on par with classical heuristics and learned methods.
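For readers unfamiliar with the 2-opt operator named in the abstract, the following is a minimal sketch of the classical, hand-crafted version: a 2-opt move reverses a segment of the tour, and a first-improvement local search applies such moves until none shortens the tour. This is not the learned policy described in the paper; the function names (`tour_length`, `two_opt_move`, `two_opt_local_search`) and the toy instance are illustrative assumptions.

```python
import math


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(math.dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n))


def two_opt_move(tour, i, j):
    """Return a new tour with the segment tour[i..j] (inclusive) reversed.

    This replaces the two edges entering and leaving the segment
    with two new edges, which is the classical 2-opt exchange.
    """
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def two_opt_local_search(points, tour):
    """Classical first-improvement 2-opt (hypothetical baseline, not the learned policy)."""
    best = list(tour)
    best_len = tour_length(points, best)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = two_opt_move(best, i, j)
                candidate_len = tour_length(points, candidate)
                # Small tolerance guards against floating-point cycling.
                if candidate_len < best_len - 1e-12:
                    best, best_len = candidate, candidate_len
                    improved = True
    return best, best_len


if __name__ == "__main__":
    # Toy instance (assumed for illustration): corners of the unit square,
    # visited in an order whose edges cross.
    pts = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
    tour, length = two_opt_local_search(pts, [0, 1, 2, 3])
    print(tour, length)
```

In the approach the abstract describes, the exhaustive loop over `(i, j)` pairs would be replaced by a stochastic policy that selects the pair given the current solution; the sketch only fixes what a single 2-opt operation does.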
Similar resources
Shared Autonomy via Deep Reinforcement Learning
In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal. The goal is often unknown ex-ante, so prior work enables agents to infer the goal from user input and assist with the task. Such methods tend to assume some combination of knowledge of the dynamics of the environment, the user’s policy given their goal, and the set of possible goals the user might ...
Strategic Dialogue Management via Deep Reinforcement Learning
Artificially intelligent agents equipped with strategic skills that can negotiate during their interactions with other natural or artificial agents are still underdeveloped. This paper describes a successful application of Deep Reinforcement Learning (DRL) for training intelligent agents with strategic conversational skills, in a situated dialogue setting. Previous studies have modelled the beh...
Inverse Reinforcement Learning via Deep Gaussian Process
We propose a new approach to inverse reinforcement learning (IRL) based on the deep Gaussian process (deep GP) model, which is capable of learning complicated reward structures with few demonstrations. Our model stacks multiple latent GP layers to learn abstract representations of the state feature space, which is linked to the demonstrations through the Maximum Entropy learning framework. Inco...
RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning
Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a “fast” reinforcement learning algorithm, we p...
Deep Reinforcement Learning for Solving the Vehicle Routing Problem
We present an end-to-end framework for solving Vehicle Routing Problem (VRP) using deep reinforcement learning. In this approach, we train a single model that finds near-optimal solutions for problem instances sampled from a given distribution, only by observing the reward signals and following feasibility rules. Our model represents a parameterized stochastic policy, and by applying a policy g...
Journal
Journal title: SN Computer Science
Year: 2021
ISSN: 2661-8907, 2662-995X
DOI: https://doi.org/10.1007/s42979-021-00779-2